On Using Written Language Training Data for Spoken Language Modeling

Authors

  • Richard M. Schwartz
  • Long Nguyen
  • Francis Kubala
  • George Chou
  • George Zavaliagkos
  • John Makhoul
Abstract

We attempted to improve recognition accuracy by reducing the inadequacies of the lexicon and language model. Specifically, we address the following three problems: (1) the best size for the lexicon, (2) conditioning written text for spoken language recognition, and (3) using additional training outside the text distribution. We found that increasing the lexicon from 20,000 words to 40,000 words reduced the percentage of words outside the vocabulary from over 2% to just 0.2%, thereby decreasing the error rate substantially. The error rate on words already in the vocabulary did not increase substantially. We modified the language model training text by applying rules to simulate the differences between the training text and what people actually said. Finally, we found that using another three years of training text, even without the appropriate preprocessing, substantially improved the language model. We also tested these approaches on spontaneous news dictation and found similar improvements.

1. INTRODUCTION

Speech recognition accuracy is affected as much by the language model as by the acoustic model. In general, the word error rate is roughly proportional to the square root of the perplexity of the language model. In addition, in a natural unlimited vocabulary task, a substantial portion of the word errors come from words that are not even in the recognition vocabulary. These out-of-vocabulary (OOV) words have no chance of being recognized correctly. Thus, our goal is to estimate a good language model from the available training text, and to determine a vocabulary that is likely to cover the test vocabulary.

The straightforward solution to improving the language model might be to increase the complexity of the model (e.g., use a higher order Markov chain) and/or obtain more language model training text. But this by itself will not necessarily provide a better model, especially if the text is not an ideal model of what people will actually say.
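The out-of-vocabulary rate discussed above can be illustrated with a minimal sketch. It assumes the vocabulary is chosen as the most frequent words in the training text, which this excerpt does not confirm as the authors' procedure; the function names and the toy sentences are hypothetical and exist only for illustration.

```python
from collections import Counter


def build_vocabulary(training_tokens, size):
    """Return the `size` most frequent words in the training text.

    Assumption: frequency-based selection; the excerpt does not state
    how the recognition vocabulary was actually chosen.
    """
    counts = Counter(training_tokens)
    return {word for word, _ in counts.most_common(size)}


def oov_rate(test_tokens, vocabulary):
    """Fraction of test tokens that are not in the vocabulary."""
    if not test_tokens:
        return 0.0
    missing = sum(1 for token in test_tokens if token not in vocabulary)
    return missing / len(test_tokens)


# Toy data (hypothetical), standing in for training text and test speech.
training = "the market rose the market fell stocks rose bonds fell the dollar".split()
test = "the dollar rose and stocks fell".split()

small = build_vocabulary(training, 3)
large = build_vocabulary(training, 6)

print(f"OOV rate, 3-word vocabulary: {oov_rate(test, small):.2f}")
print(f"OOV rate, 6-word vocabulary: {oov_rate(test, large):.2f}")
```

With frequency-based selection, the larger vocabulary is a superset of the smaller one, so its OOV rate on any test set can only stay the same or fall. Whether the overall error rate falls as well depends on how much the added words confuse recognition of words already in the vocabulary.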
The simple solution to increase the coverage of the vocabulary is to increase the vocabulary size. But this also increases the word error rate and the computation and size of the recognition

Similar resources

On-Line Learning of a Persian Spoken Dialogue System Using Real Training Data

The first spoken dialogue system developed for the Persian language is introduced. This is a ticket reservation system with Persian ASR and NLU modules. The focus of the paper is on learning the dialogue management module. In this work, real on-line training data are used during the learning process. For on-line learning, the effect of the variations of discount factor (g) on the learning speed...

Comparison of Spectral Methods for Spoken Language Identification

Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...

Language modeling for speech recognition of spoken Cantonese

This paper addresses the problem of language modeling for LVCSR of Cantonese spoken in daily communication. As a spoken dialect, Cantonese is not used in written documents and published materials. Thus it is difficult to collect a sufficient amount of written Cantonese text data for the training of statistical language models. We propose to solve this problem by translating standard Chinese text,...

Adult’s Learning Strategies for Receptive Skill Self-managing or Teacher-managing

Receptive language skill refers to answering appropriately to another person's spoken language. A lot of teachers try to develop receptive language skills in their language learners. When receptive language skills are not appropriately acquired, learners may miss significant learning opportunities resulting in delays in the development and acquisition of spoken language. The goals of this paper...

Tagging a Corpus of Spoken Swedish

In this article, we present and evaluate a method for training a statistical part-of-speech tagger on data from written language and then adapting it to the requirements of tagging a corpus of transcribed spoken language, in our case spoken Swedish. This is currently a significant problem for many research groups working with spoken language, since the availability of tagged training data from s...


Publication date: 1994